Choropleth Map of Registered Pets in Gyeonggi-do

Published

October 27, 2022

Importing libraries

Code
import numpy as np
import pandas as pd
import json
import folium
from branca.colormap import linear
print('folium version: ', folium.__version__)
folium version:  0.13.0

Preventing broken Korean fonts in plots

Code
import matplotlib.pyplot as plt
import seaborn as sns

# Korean font setup
plt.rcParams['font.size'] = 11.0
plt.rcParams['font.family'] = ['Malgun Gothic', 'AppleGothic']

# Keep the minus sign from rendering as a broken glyph in plots
plt.rcParams['axes.unicode_minus'] = False

๋ฐ˜๋ ค๋™๋ฌผ ๋“ฑ๋กํ˜„ํ™ฉ ๋ฐ์ดํ„ฐ ๋ถˆ๋Ÿฌ์˜ค๊ณ  ๋‚ด์šฉ ํ™•์ธํ•˜๊ธฐ

Code
pet = pd.read_csv('data/๋ฐ˜๋ ค๋™๋ฌผ๋“ฑ๋กํ˜„ํ™ฉ.csv')

print('pet data shape: ', pet.shape)
pet data shape:  (729, 14)
Code
pet.head(2)
์‹œ๊ตฐ๋ช… ์๋ฉด๋™๋ช… ๋“ฑ๋ก๋™๋ฌผ์ˆ˜(๋งˆ๋ฆฌ) (๋“ฑ๋ก์ฃผ์ฒด)์‹œ๊ตฐ๊ตฌ๋“ฑ๋ก (๋“ฑ๋ก์ฃผ์ฒด)๋Œ€ํ–‰์—…์ฒด๋“ฑ๋ก (๋“ฑ๋ก์ฃผ์ฒด)๊ธฐํƒ€ (RFID์ข…๋ฅ˜)๋‚ด์žฅํ˜• (RFID์ข…๋ฅ˜)์™ธ์žฅํ˜• (RFID์ข…๋ฅ˜)์ธ์‹ํ‘œ ๋“ฑ๋กํ’ˆ์ข…์ˆ˜ ๋“ฑ๋ก์†Œ์œ ์ž์ˆ˜ ๋™๋ฌผ์†Œ์œ ์ž๋‹น๋“ฑ๋ก๋™๋ฌผ์ˆ˜ ํ•ด๋‹น๋™์˜๋“ฑ๋ก๋Œ€ํ–‰์—…์ฒด์ˆ˜ ๋ฐ์ดํ„ฐ๊ธฐ์ค€์ผ์ž
0 ๊ฐ€ํ‰๊ตฐ ๊ฐ€ํ‰์ 941 NaN NaN NaN 596 294 51 NaN 85.0 NaN 3.0 2022-06-14
1 ๊ฐ€ํ‰๊ตฐ ๋ถ๋ฉด 289 NaN NaN NaN 176 93 20 NaN 185.0 NaN 0.0 2022-06-14
Code
pet['๋ฐ์ดํ„ฐ๊ธฐ์ค€์ผ์ž'] = pd.to_datetime(pet['๋ฐ์ดํ„ฐ๊ธฐ์ค€์ผ์ž'])
print('๋ฐ์ดํ„ฐ ์ˆ˜์ง‘ ๊ธฐ๊ฐ„: ', pet['๋ฐ์ดํ„ฐ๊ธฐ์ค€์ผ์ž'].dt.year.unique())
๋ฐ์ดํ„ฐ ์ˆ˜์ง‘ ๊ธฐ๊ฐ„:  [2022 2020 2021]

ํ•„์š”ํ•œ ์ปฌ๋Ÿผ๋งŒ ์„ ํƒ

Code
print('์ „์ฒด ์ปฌ๋Ÿผ๋ช…: ', pet.columns)
์ „์ฒด ์ปฌ๋Ÿผ๋ช…:  Index(['์‹œ๊ตฐ๋ช…', '์๋ฉด๋™๋ช…', '๋“ฑ๋ก๋™๋ฌผ์ˆ˜(๋งˆ๋ฆฌ)', '(๋“ฑ๋ก์ฃผ์ฒด)์‹œ๊ตฐ๊ตฌ๋“ฑ๋ก', '(๋“ฑ๋ก์ฃผ์ฒด)๋Œ€ํ–‰์—…์ฒด๋“ฑ๋ก', '(๋“ฑ๋ก์ฃผ์ฒด)๊ธฐํƒ€',
       '(RFID์ข…๋ฅ˜)๋‚ด์žฅํ˜•', '(RFID์ข…๋ฅ˜)์™ธ์žฅํ˜•', '(RFID์ข…๋ฅ˜)์ธ์‹ํ‘œ', '๋“ฑ๋กํ’ˆ์ข…์ˆ˜', '๋“ฑ๋ก์†Œ์œ ์ž์ˆ˜',
       '๋™๋ฌผ์†Œ์œ ์ž๋‹น๋“ฑ๋ก๋™๋ฌผ์ˆ˜', 'ํ•ด๋‹น๋™์˜๋“ฑ๋ก๋Œ€ํ–‰์—…์ฒด์ˆ˜', '๋ฐ์ดํ„ฐ๊ธฐ์ค€์ผ์ž'],
      dtype='object')
Code
print('์‚ฌ์šฉํ•  ์ปฌ๋Ÿผ๋งŒ ๊ณจ๋ผ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์— ๋ฎ์–ด ์”Œ์›€')
pet = pet.loc[ :, ['์‹œ๊ตฐ๋ช…', '์๋ฉด๋™๋ช…', '๋“ฑ๋ก๋™๋ฌผ์ˆ˜(๋งˆ๋ฆฌ)', '๋“ฑ๋ก์†Œ์œ ์ž์ˆ˜']]
pet.head()
์‚ฌ์šฉํ•  ์ปฌ๋Ÿผ๋งŒ ๊ณจ๋ผ ๋ฐ์ดํ„ฐํ”„๋ ˆ์ž„์— ๋ฎ์–ด ์”Œ์›€
์‹œ๊ตฐ๋ช… ์๋ฉด๋™๋ช… ๋“ฑ๋ก๋™๋ฌผ์ˆ˜(๋งˆ๋ฆฌ) ๋“ฑ๋ก์†Œ์œ ์ž์ˆ˜
0 ๊ฐ€ํ‰๊ตฐ ๊ฐ€ํ‰์ 941 85.0
1 ๊ฐ€ํ‰๊ตฐ ๋ถ๋ฉด 289 185.0
2 ๊ฐ€ํ‰๊ตฐ ์ƒ๋ฉด 399 243.0
3 ๊ฐ€ํ‰๊ตฐ ์„ค์•…๋ฉด 1111 625.0
4 ๊ฐ€ํ‰๊ตฐ ์กฐ์ข…๋ฉด 416 274.0

Checking missing values

  • Print the rows with missing values to see whether anything needs separate handling
  • df.isna().sum() -> prints the number of missing values per column
  • Combine the [column with missing values].isna() masks with | (or) for boolean indexing
Code
pet.isna().sum()
시군명          0
읍면동명         2
등록동물수(마리)    0
등록소유자수       1
dtype: int64
Code
# The missing values were judged to have little effect on the analysis, so they are left as-is
# (note that 의정부시 has no owner count at all)
pet[pet['읍면동명'].isna() | pet['등록소유자수'].isna()]
     시군명  읍면동명  등록동물수(마리)  등록소유자수
106  군포시  NaN  15480  12226.0
537  의정부시  NaN  58997  NaN
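When more columns are involved, chaining one isna() mask per column gets long. A minimal sketch of an alternative, using a hypothetical three-row miniature of the pet table (pet_demo is not part of the original notebook; its values are copied from the outputs above):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the `pet` table, for illustration only.
pet_demo = pd.DataFrame({
    '시군명': ['가평군', '군포시', '의정부시'],
    '읍면동명': ['가평읍', np.nan, np.nan],
    '등록동물수(마리)': [941, 15480, 58997],
    '등록소유자수': [85.0, 12226.0, np.nan],
})

# True for every row that has a missing value in at least one column.
rows_with_na = pet_demo.isna().any(axis=1)
print(pet_demo[rows_with_na])
```

This selects the same rows as OR-ing the per-column masks, without having to name each column.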

Limiting the analysis to the si/gun level with groupby

  • With limited time, cleaning and visualizing the data down to the eup/myeon/dong level was impractical, so the analysis is limited to the si/gun (city/county) level
  • The markers showing registered pet counts on the map are placed at each city hall or county office
  • To merge with the government office latitude/longitude file, a city hall/county office name is built and attached as a separate column
Code
print('Number of eup/myeon/dong: ', len(pet['읍면동명'].unique()))
Number of eup/myeon/dong:  675
Code
print(pet['시군명'].unique())
print('Number of si/gun: ', len(pet['시군명'].unique()))
['가평군' '고양시' '과천시' '광명시' '광주시' '구리시' '군포시' '김포시' '남양주시' '동두천시' '부천시' '성남시'
 '수원시' '시흥시' '안산시' '안성시' '안양시' '양주시' '양평군' '여주시' '연천군' '오산시' '용인시' '의왕시'
 '의정부시' '이천시' '파주시' '평택시' '포천시' '하남시' '화성시']
Number of si/gun:  31
  • groupby at the si/gun level -> total registered pets per si/gun
  • Build a DataFrame of registered animals and registered owners per si/gun
  • reset_index so it can be merged with the government office data below
Code
pet_group = pet.groupby('시군명')
pet_num = pet_group[['등록동물수(마리)', '등록소유자수']].sum()
pet_num = pet_num.reset_index()
pet_num.head(5)
   시군명  등록동물수(마리)  등록소유자수
0  가평군  4017  2006.0
1  고양시  73477  54580.0
2  과천시  2974  2325.0
3  광명시  20161  15698.0
4  광주시  29042  19368.0
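One side effect of sum() after groupby is worth noting: a group whose values are all missing is reported as 0.0 rather than NaN, which appears to be why 의정부시 later shows 0.0 owners. A minimal sketch with hypothetical rows (demo is not part of the original notebook), using the min_count parameter of the groupby sum:

```python
import numpy as np
import pandas as pd

# Hypothetical rows: 의정부시 has only a missing owner count.
demo = pd.DataFrame({
    '시군명': ['가평군', '가평군', '의정부시'],
    '등록소유자수': [85.0, 185.0, np.nan],
})

default_sum = demo.groupby('시군명')['등록소유자수'].sum()
strict_sum = demo.groupby('시군명')['등록소유자수'].sum(min_count=1)

print(default_sum)  # the all-NaN group is reported as 0.0
print(strict_sum)   # the all-NaN group stays NaN
```

With min_count=1 a group needs at least one non-missing value to produce a sum, so "unknown" is not silently turned into zero.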

Horizontal bar chart

  • Sort the DataFrame by 등록동물수(마리): df.sort_values('column name'). The default ascending order puts the largest bar at the top of a horizontal bar chart, so the chart reads in descending order from top to bottom
  • Horizontal bar chart: bars = ax.barh(indexes, values)
  • Show the value at the end of each bar: ax.bar_label(bars)
  • Add a thousands separator to the values: ax.bar_label(bars, labels=[f'{x:,.0f}' for x in bars.datavalues])
Code
pet_graph = pet_num.sort_values('등록동물수(마리)')

fig, ax = plt.subplots(figsize=(10,7))
barh = ax.barh(pet_graph['시군명'], pet_graph['등록동물수(마리)'])
ax.bar_label(barh, labels=[f'{x:,.0f}' for x in barh.datavalues])
ax.set_xlim(right=130000) # set the x-axis maximum; without it the value label for 부천시 spills outside the frame
plt.show()

  • Each bar in bars holds an xy coordinate tuple, the bar width (width, here the pet count), the bar height (height, here the bar thickness), and an angle of 0 degrees.
Code
for bar in barh:
    print(bar)
Rectangle(xy=(0, -0.4), width=2020, height=0.8, angle=0)
Rectangle(xy=(0, 0.6), width=2974, height=0.8, angle=0)
Rectangle(xy=(0, 1.6), width=4017, height=0.8, angle=0)
Rectangle(xy=(0, 2.6), width=5252, height=0.8, angle=0)
Rectangle(xy=(0, 3.6), width=6590, height=0.8, angle=0)
Rectangle(xy=(0, 4.6), width=6998, height=0.8, angle=0)
Rectangle(xy=(0, 5.6), width=7912, height=0.8, angle=0)
Rectangle(xy=(0, 6.6), width=11168, height=0.8, angle=0)
Rectangle(xy=(0, 7.6), width=11848, height=0.8, angle=0)
Rectangle(xy=(0, 8.6), width=12381, height=0.8, angle=0)
Rectangle(xy=(0, 9.6), width=13147, height=0.8, angle=0)
Rectangle(xy=(0, 10.6), width=13291, height=0.8, angle=0)
Rectangle(xy=(0, 11.6), width=14471, height=0.8, angle=0)
Rectangle(xy=(0, 12.6), width=15480, height=0.8, angle=0)
Rectangle(xy=(0, 13.6), width=17634, height=0.8, angle=0)
Rectangle(xy=(0, 14.6), width=19950, height=0.8, angle=0)
Rectangle(xy=(0, 15.6), width=20161, height=0.8, angle=0)
Rectangle(xy=(0, 16.6), width=24057, height=0.8, angle=0)
Rectangle(xy=(0, 17.6), width=29042, height=0.8, angle=0)
Rectangle(xy=(0, 18.6), width=30539, height=0.8, angle=0)
Rectangle(xy=(0, 19.6), width=35644, height=0.8, angle=0)
Rectangle(xy=(0, 20.6), width=42957, height=0.8, angle=0)
Rectangle(xy=(0, 21.6), width=48264, height=0.8, angle=0)
Rectangle(xy=(0, 22.6), width=50227, height=0.8, angle=0)
Rectangle(xy=(0, 23.6), width=55728, height=0.8, angle=0)
Rectangle(xy=(0, 24.6), width=58997, height=0.8, angle=0)
Rectangle(xy=(0, 25.6), width=60468, height=0.8, angle=0)
Rectangle(xy=(0, 26.6), width=65330, height=0.8, angle=0)
Rectangle(xy=(0, 27.6), width=73477, height=0.8, angle=0)
Rectangle(xy=(0, 28.6), width=86158, height=0.8, angle=0)
Rectangle(xy=(0, 29.6), width=113892, height=0.8, angle=0)

As a vertical bar chart

  • In a vertical bar chart the pet count is stored in the bar height (height), and angle is 0 as before.
Code
pet_graph = pet_num.sort_values('등록동물수(마리)', ascending=False)
fig, ax = plt.subplots(figsize=(15,7))
bar = ax.bar(pet_graph['시군명'], pet_graph['등록동물수(마리)'])
ax.bar_label(bar, labels=[f'{x:,.0f}' for x in bar.datavalues]) # how can the bar value labels be rotated by 30 degrees???
plt.setp(ax.get_xticklabels(), rotation=30) # rotate the x-axis tick labels by 30 degrees
plt.show()
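On the question in the code comment about rotating the value labels: bar_label forwards extra keyword arguments to the text objects it creates, so passing rotation=30 should tilt the labels themselves. A minimal sketch with placeholder category names and the three largest counts from the output above (names, counts and the Agg backend choice are assumptions made for a self-contained example):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Placeholder names; the real plot uses pet_graph['시군명'].
names = ['A', 'B', 'C']
counts = [113892, 86158, 73477]

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.bar(names, counts)
# Extra keyword arguments are passed on to the label text objects,
# so rotation=30 rotates the value labels.
labels = ax.bar_label(bars, labels=[f'{x:,.0f}' for x in bars.datavalues], rotation=30)
plt.setp(ax.get_xticklabels(), rotation=30)  # tick labels, as in the original code
```

Rotated labels take more vertical room, so the y-axis limit may need a little headroom (ax.set_ylim) in the full chart.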

Adding government office coordinates

  • The pet counts are displayed at the location of each city hall/county office on the map.
  • That requires merging the latitude and longitude of each office into the existing DataFrame.
  • For offices with no coordinates in the csv file, the values were taken from the 다울 address conversion site
Code
gov = pd.read_csv('data/경기도청사및출장소현황.csv')
gov.head(3)
   집계일자  시군명  구분명  전화번호안내  소재지우편번호  소재지도로명주소  소재지지번주소  WGS84위도  WGS84경도
0  2022-05-06  광명시  철산2동  02-2680-6609  14215  경기도 광명시 시청로 61  경기도 광명시 철산동 160-4번지  37.484689  126.866622
1  2022-05-06  광명시  일직동  02-2680-5800  14345  경기도 광명시 양지로 19 6층  경기도 광명시 일직동 512-3번지 6층  37.418938  126.882693
2  2022-04-16  오산시  남촌동행정복지센터  031-8036-6260  18119  경기도 오산시 청학로 55-5 (청학동)  경기도 오산시 청학동 16-6  37.154587  127.063455

Left join with the pet DataFrame

  • Left join the existing pet_num DataFrame with the gov DataFrame, which holds the coordinates
  • To align the join keys, the suffix '청' (office) is appended to the si/gun name
Code
pet_num['관공서'] = pet_num['시군명'] + '청'
  • Left join on the existing DataFrame, then check the column names
Code
tmp = pet_num.merge(gov, how='left', left_on='관공서', right_on='구분명')
tmp.columns
Index(['시군명_x', '등록동물수(마리)', '등록소유자수', '관공서', '집계일자', '시군명_y', '구분명', '전화번호안내',
       '소재지우편번호', '소재지도로명주소', '소재지지번주소', 'WGS84위도', 'WGS84경도'],
      dtype='object')
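A left join keeps every si/gun even when no office row matches, so mismatched keys only surface later as missing coordinates. As a sketch, the indicator option of merge can list the unmatched keys directly (pet_demo and gov_demo are hypothetical miniatures, not the real files):

```python
import pandas as pd

# Hypothetical miniatures of pet_num and gov.
pet_demo = pd.DataFrame({'관공서': ['가평군청', '부천시청'],
                         '등록동물수(마리)': [4017, 113892]})
gov_demo = pd.DataFrame({'구분명': ['가평군청'],
                         'WGS84위도': [37.831318],
                         'WGS84경도': [127.509706]})

merged = pet_demo.merge(gov_demo, how='left',
                        left_on='관공서', right_on='구분명', indicator=True)
# '_merge' is 'left_only' where no office row matched the key.
unmatched = merged.loc[merged['_merge'] == 'left_only', '관공서'].tolist()
print(unmatched)
```

The '_merge' column can be dropped again once the unmatched keys have been reviewed.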
  • Keep only the needed columns in a new DataFrame, pet_num_si
Code
# .copy() makes pet_num_si an independent DataFrame, so the assignments below do not act on a view of tmp
pet_num_si = tmp.loc[:, ['시군명_x', '등록동물수(마리)', '등록소유자수', '관공서', 'WGS84위도', 'WGS84경도']].copy()
pet_num_si.head()
   시군명_x  등록동물수(마리)  등록소유자수  관공서  WGS84위도  WGS84경도
0  가평군  4017  2006.0  가평군청  37.831318  127.509706
1  고양시  73477  54580.0  고양시청  37.658422  126.831964
2  과천시  2974  2325.0  과천시청  37.429812  126.986963
3  광명시  20161  15698.0  광명시청  37.479097  126.864846
4  광주시  29042  19368.0  광주시청  37.429433  127.255084
  • Check missing values again -> latitude and longitude have missing values
Code
pet_num_si.isna().sum()
시군명_x        0
등록동물수(마리)    0
등록소유자수       0
관공서          0
WGS84위도      3
WGS84경도      3
dtype: int64
  • Print only the rows without latitude or longitude
Code
pet_num_si[ (pet_num_si['WGS84위도'].isna()) | (pet_num_si['WGS84경도'].isna()) ]
    시군명_x  등록동물수(마리)  등록소유자수  관공서  WGS84위도  WGS84경도
8   남양주시  42957  31542.0  남양주시청  NaN  NaN
10  부천시  113892  86532.0  부천시청  NaN  NaN
30  화성시  48264  35496.0  화성시청  NaN  NaN
  • Fill in the coordinates taken from the 다울 address conversion site
Code
pet_num_si.iloc[8, [4, 5]] = 37.6366920245, 127.2174958647  # 남양주1청사 (Namyangju City Hall, Building 1)
pet_num_si.iloc[10, [4, 5]] = 37.5029019, 126.765889 # 부천시청
pet_num_si.iloc[30, [4, 5]] = 37.1994150, 126.831523  # 화성시청
  • Check missing values again -> none
Code
pet_num_si.isna().sum()
시군명_x        0
등록동물수(마리)    0
등록소유자수       0
관공서          0
WGS84위도      0
WGS84경도      0
dtype: int64
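Filling by position (iloc[8, [4, 5]]) depends on the current row and column order. A minimal sketch of a label-based alternative, assuming a hypothetical two-row frame (demo and manual_coords are illustrative names; the 부천시 coordinates are the ones entered above):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of pet_num_si with one missing coordinate pair.
demo = pd.DataFrame({
    '시군명_x': ['가평군', '부천시'],
    'WGS84위도': [37.831318, np.nan],
    'WGS84경도': [127.509706, np.nan],
})

# Coordinates keyed by si/gun name instead of row position.
manual_coords = {'부천시': (37.5029019, 126.765889)}

for name, (lat, lon) in manual_coords.items():
    mask = demo['시군명_x'] == name
    demo.loc[mask, ['WGS84위도', 'WGS84경도']] = [lat, lon]

print(demo)
```

Keying on the name keeps the fix valid even if rows are re-sorted or columns are added before this step.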

Renaming columns for convenience

Code
pet_num_si = pet_num_si.rename(columns = 
                               {'시군명_x':'시군명', 
                                '등록동물수(마리)':'등록동물수', 
                                'WGS84위도':'위도', 
                                'WGS84경도':'경도'})
pet_num_si.head()
   시군명  등록동물수  등록소유자수  관공서  위도  경도
0  가평군  4017  2006.0  가평군청  37.831318  127.509706
1  고양시  73477  54580.0  고양시청  37.658422  126.831964
2  과천시  2974  2325.0  과천시청  37.429812  126.986963
3  광명시  20161  15698.0  광명시청  37.479097  126.864846
4  광주시  29042  19368.0  광주시청  37.429433  127.255084

Adding population data

  • Areas with larger populations also have more registered pets, so the count is divided by the population to get pets per capita
  • Only the most recent population figures, September 2022, are kept
  • The data double-counts: rows given at the si/gun/gu level already include the population of the eup/myeon/dong beneath them
  • So the double-counted rows have to be removed first,
  • leaving only the rows given at the si/gun/gu level
Code
popu = pd.read_csv('data/주민등록인구집계현황.csv')
popu.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 107486 entries, 0 to 107485
Data columns (total 40 columns):
 #   Column       Non-Null Count   Dtype 
---  ------       --------------   ----- 
 0   연도           107486 non-null  int64 
 1   월            107486 non-null  int64 
 2   행정구역구분명      107486 non-null  object
 3   행정구역명        107486 non-null  object
 4   총 인구수        107486 non-null  int64 
 5   0~9세         107486 non-null  int64 
 6   10~19세       107486 non-null  int64 
 7   20~29세       107486 non-null  int64 
 8   30~39세       107486 non-null  int64 
 9   40~49세       107486 non-null  int64 
 10  50~59세       107486 non-null  int64 
 11  60~69세       107486 non-null  int64 
 12  70~79세       107486 non-null  int64 
 13  80~89세       107486 non-null  int64 
 14  90~99세       107486 non-null  int64 
 15  100세 이상      107486 non-null  int64 
 16  총 인구수 (남)    107486 non-null  int64 
 17  0~9세 (남)     107486 non-null  int64 
 18  10~19세 (남)   107486 non-null  int64 
 19  20~29세 (남)   107486 non-null  int64 
 20  30~39세 (남)   107486 non-null  int64 
 21  40~49세 (남)   107486 non-null  int64 
 22  50~59세 (남)   107486 non-null  int64 
 23  60~69세 (남)   107486 non-null  int64 
 24  70~79세 (남)   107486 non-null  int64 
 25  80~89세 (남)   107486 non-null  int64 
 26  90~99세 (남)   107486 non-null  int64 
 27  100세 이상 (남)  107486 non-null  int64 
 28  총 인구수 (여)    107486 non-null  int64 
 29  0~9세 (여)     107486 non-null  int64 
 30  10~19세 (여)   107486 non-null  int64 
 31  20~29세 (여)   107486 non-null  int64 
 32  30~39세 (여)   107486 non-null  int64 
 33  40~49세 (여)   107486 non-null  int64 
 34  50~59세 (여)   107486 non-null  int64 
 35  60~69세 (여)   107486 non-null  int64 
 36  70~79세 (여)   107486 non-null  int64 
 37  80~89세 (여)   107486 non-null  int64 
 38  90~99세 (여)   107486 non-null  int64 
 39  100세 이상 (여)  107486 non-null  int64 
dtypes: int64(38), object(2)
memory usage: 32.8+ MB
  • Keep only the total population for September 2022, the most recent data
Code
popu = popu[(popu['연도'] == 2022) & (popu['월'] == 9)][['행정구역명', '총 인구수']]
popu.head(5)
   행정구역명  총 인구수
0  경기도  13574353
1  경기도 가평군  62168
2  경기도 가평군 가평읍  19532
3  경기도 가평군 북면  3821
4  경기도 가평군 상면  5688
  • Drop the first row, since it is the population of all of Gyeonggi-do
Code
popu = popu.drop(0, axis=0)
popu.head(3)
   행정구역명  총 인구수
1  경기도 가평군  62168
2  경기도 가평군 가평읍  19532
3  경기도 가평군 북면  3821

Dropping rows with a total population of 0

Code
print('Rows with a total population of 0: ', len(popu[popu['총 인구수'] == 0]))
Rows with a total population of 0:  5
Code
mask = popu['총 인구수'] == 0
popu = popu[~mask]
print('Rows with a total population of 0: ', len(popu[popu['총 인구수'] == 0]))
Rows with a total population of 0:  0

Checking for duplicates

Code
popu['ํ–‰์ •๊ตฌ์—ญ๋ช…'].duplicated().sum()
0

Splitting the si/gun/gu string on whitespace <- cancelled

  • Split the string on whitespace and return the parts as a separate DataFrame
  • To extract only the si/gun name from 행정구역명, split on whitespace and build a separate DataFrame
  • Column 0 holds 경기도, column 1 the si/gun name, column 2 the eup/myeon/dong
  • Inspect column 1 only

Reviewing this after the project ended, I found a mistake in this step. I had overlooked that the si/gun/gu rows already include the eup/myeon/dong rows beneath them and summed everything, so the population was doubled.

Code
# tmp = popu['행정구역명'].str.split(pat=' ', expand=True)
# tmp[1].unique() , len(tmp[1].unique())

Excluding rows whose 행정구역명 has three words (those that go down to eup/myeon/dong) by marking them False

  • With apply, split each string in the '행정구역명' column and mark as True only those whose resulting list has exactly 2 elements
  • Use boolean indexing to keep only the si/gun-level population figures
Code
popu = popu[popu['행정구역명'].apply(lambda x: len(x.split())) == 2]
popu
     행정구역명  총 인구수
1    경기도 가평군  62168
8    경기도 고양시  1079277
56   경기도 과천시  78301
63   경기도 광명시  290756
83   경기도 광주시  388893
97   경기도 구리시  191011
106  경기도 군포시  267493
119  경기도 김포시  485609
134  경기도 남양주시  734642
153  경기도 동두천시  93260
162  경기도 부천시  801503
173  경기도 성남시  928267
228  경기도 수원시  1188234
277  경기도 시흥시  512721
297  경기도 안산시  650708
325  경기도 안성시  189772
341  경기도 안양시  549724
376  경기도 양주시  236142
388  경기도 양평군  121789
401  경기도 여주시  112410
414  경기도 연천군  42731
425  경기도 오산시  230028
432  경기도 용인시  1075877
474  경기도 의왕시  162137
481  경기도 의정부시  464358
496  경기도 이천시  222722
511  경기도 파주시  486515
529  경기도 평택시  569369
555  경기도 포천시  148469
571  경기도 하남시  323558
586  경기도 화성시  892038
  • Split the 행정구역명 string on whitespace and write only the second value back into 행정구역명
Code
popu['행정구역명'] = popu['행정구역명'].apply(lambda x: x.split()[1])
popu.head()
    행정구역명  총 인구수
1   가평군  62168
8   고양시  1079277
56  과천시  78301
63  광명시  290756
83  광주시  388893
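The word-count filter above can also be written with the vectorized .str accessor instead of apply with a lambda. A minimal sketch on a hypothetical four-row miniature of popu (popu_demo is illustrative; the figures are copied from the outputs above):

```python
import pandas as pd

# Hypothetical miniature of popu: province, si/gun and eup/myeon/dong rows.
popu_demo = pd.DataFrame({
    '행정구역명': ['경기도', '경기도 가평군', '경기도 가평군 가평읍', '경기도 고양시'],
    '총 인구수': [13574353, 62168, 19532, 1079277],
})

# Number of whitespace-separated words in each name.
word_count = popu_demo['행정구역명'].str.split().str.len()
si_gun = popu_demo[word_count == 2].copy()
# Keep only the second word, i.e. the si/gun name.
si_gun['행정구역명'] = si_gun['행정구역명'].str.split().str[1]
print(si_gun)
```

Because the province row has a single word, this filter also removes it, so a separate drop of row 0 would not be needed in this variant.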

Inner join with the existing DataFrame

  • Merge the population DataFrame into the existing DataFrame
Code
df = pet_num_si.merge(popu, how='inner', left_on='시군명', right_on='행정구역명')
df.head()
   시군명  등록동물수  등록소유자수  관공서  위도  경도  행정구역명  총 인구수
0  가평군  4017  2006.0  가평군청  37.831318  127.509706  가평군  62168
1  고양시  73477  54580.0  고양시청  37.658422  126.831964  고양시  1079277
2  과천시  2974  2325.0  과천시청  37.429812  126.986963  과천시  78301
3  광명시  20161  15698.0  광명시청  37.479097  126.864846  광명시  290756
4  광주시  29042  19368.0  광주시청  37.429433  127.255084  광주시  388893
  • Drop the duplicated column
Code
df = df.drop('행정구역명', axis=1)
df.head()
   시군명  등록동물수  등록소유자수  관공서  위도  경도  총 인구수
0  가평군  4017  2006.0  가평군청  37.831318  127.509706  62168
1  고양시  73477  54580.0  고양시청  37.658422  126.831964  1079277
2  과천시  2974  2325.0  과천시청  37.429812  126.986963  78301
3  광명시  20161  15698.0  광명시청  37.479097  126.864846  290756
4  광주시  29042  19368.0  광주시청  37.429433  127.255084  388893

Registered pets per capita

Code
( df['๋“ฑ๋ก๋™๋ฌผ์ˆ˜'] / df['์ด ์ธ๊ตฌ์ˆ˜'] )
0     0.064615
1     0.068080
2     0.037982
3     0.069340
4     0.074679
5     0.064818
6     0.057871
7     0.016293
8     0.058473
9     0.075038
10    0.142098
11    0.060034
12    0.054981
13    0.059563
14    0.132407
15    0.105126
16    0.064840
17    0.050173
18    0.109131
19    0.046722
20    0.047272
21    0.062910
22    0.046685
23    0.040645
24    0.127051
25    0.059029
26    0.049448
27    0.106202
28    0.075221
29    0.054500
30    0.054105
dtype: float64
  • The values are too small to read comfortably, so compute pets per 1,000 residents instead
  • Print only the 5 si/gun with the largest values: df.nlargest(n, 'column name')
Code
df['1000명당마리수'] = np.round( (df['등록동물수'] / df['총 인구수']) * 1000, 2)
df.nlargest(5, '1000명당마리수')
    시군명  등록동물수  등록소유자수  관공서  위도  경도  총 인구수  1000명당마리수
10  부천시  113892  86532.0  부천시청  37.502902  126.765889  801503  142.10
14  안산시  86158  64634.0  안산시청  37.322131  126.830243  650708  132.41
24  의정부시  58997  0.0  의정부시청  37.738063  127.033840  464358  127.05
18  양평군  13291  7308.0  양평군청  37.491741  127.487682  121789  109.13
27  평택시  60468  43718.0  평택시청  36.992300  127.112527  569369  106.20
  • Pets per owner
  • 의정부시 has no owner information, so its owner count is 0.0 and the ratio below comes out as inf
Code
df['1소유자당마리수'] = np.round(df['등록동물수'] / df['등록소유자수'], 2)
df.nlargest(5, '1소유자당마리수')
    시군명  등록동물수  등록소유자수  관공서  위도  경도  총 인구수  1000명당마리수  1소유자당마리수
24  의정부시  58997  0.0  의정부시청  37.738063  127.033840  464358  127.05  inf
0   가평군  4017  2006.0  가평군청  37.831318  127.509706  62168  64.62  2.00
28  포천시  11168  5719.0  포천시청  37.894701  127.200341  148469  75.22  1.95
18  양평군  13291  7308.0  양평군청  37.491741  127.487682  121789  109.13  1.82
19  여주시  5252  3365.0  여주시청  37.298214  127.636623  112410  46.72  1.56
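The inf for 의정부시 comes from dividing by an owner count of 0. One way to keep it out of the ranking, sketched on a hypothetical two-row frame (demo is illustrative; treating 0 owners as "unknown" is an assumption, not something the source data states):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of df with the two rows relevant here.
demo = pd.DataFrame({
    '시군명': ['가평군', '의정부시'],
    '등록동물수': [4017, 58997],
    '등록소유자수': [2006.0, 0.0],
})

# Treat an owner count of 0 as unknown so the ratio becomes NaN, not inf.
owners = demo['등록소유자수'].replace(0, np.nan)
demo['1소유자당마리수'] = (demo['등록동물수'] / owners).round(2)

top = demo.nlargest(1, '1소유자당마리수')
print(top)
```

With NaN in place of inf, nlargest ranks only the si/gun that actually have an owner count.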

Visualizing with horizontal bar charts

  • Draw the charts with subplots
Code
petall_graph = df.sort_values('등록동물수')
pet_graph = df.sort_values('1000명당마리수')

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(20,10))

ax[0].barh(petall_graph['시군명'], petall_graph['등록동물수'])
ax[0].set_title('Registered pets')

ax[1].barh(pet_graph['시군명'], pet_graph['1000명당마리수'])
ax[1].set_title('Pets per 1,000 residents')

plt.show()

Building the choropleth map

  • The values in the DataFrame's '시군명' column must match the values of the 'sggnm' key in the json file for an area to be colored on the map. Otherwise the area is rendered black

Registered pets by si/gun in Gyeonggi-do

Code
m = folium.Map(
    location = [37.528043, 126.980238],
    zoom_start = 8)

lat = list(df['위도'])
long = list(df['경도'])
name = list(df['등록동물수'])
si = list(df['시군명'])

for i in range(len(lat)):
    folium.Marker(
            [lat[i], long[i]], tooltip=f'<b>{si[i]}</b><br>Registered pets: {str(name[i])}',
            icon=folium.Icon(color='red')
            ).add_to(m)


# Load the boundary file inside a with block so the file handle is closed
with open('data/kyeong.geojson') as f:
    geo_data = json.load(f)

folium.Choropleth(
    geo_data = geo_data,
    data = df,
    columns= ['시군명', '등록동물수'],
    key_on = 'feature.properties.sggnm',
    fill_color = 'Reds',
    fill_opacity = 0.7,
    line_opacity = 0.4,
    legend_name = 'Registered pets by si/gun (animals)'
).add_to(m)

folium.LayerControl().add_to(m)

m
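The key-matching requirement noted at the start of this section can be checked before drawing the map by comparing the two sets of names. A minimal sketch with hypothetical stand-ins for df and the loaded GeoJSON (df_demo, geo_demo and the '고양시 덕양구' value are invented for illustration; only the properties.sggnm layout is taken from the key_on setting above):

```python
import pandas as pd

# Hypothetical stand-ins for df and geo_data.
df_demo = pd.DataFrame({'시군명': ['가평군', '고양시', '부천시']})
geo_demo = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {'sggnm': '가평군'}, 'geometry': None},
        {'type': 'Feature', 'properties': {'sggnm': '고양시 덕양구'}, 'geometry': None},
    ],
}

geo_names = {feature['properties']['sggnm'] for feature in geo_demo['features']}
df_names = set(df_demo['시군명'])

missing_in_geo = sorted(df_names - geo_names)  # data rows that color no area
missing_in_df = sorted(geo_names - df_names)   # areas left without a value
print(missing_in_geo, missing_in_df)
```

Two empty lists would indicate that every area on the map has a matching row in the DataFrame.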

Registered pets per 1,000 residents by si/gun in Gyeonggi-do

Code
m = folium.Map(
    location = [37.528043, 126.980238],
    zoom_start = 8)

lat = list(df['위도'])
long = list(df['경도'])
name = list(df['1000명당마리수'])  # was df['등록동물수'], which did not match the tooltip label
si = list(df['시군명'])

for i in range(len(lat)):
    folium.Marker(
            [lat[i], long[i]], tooltip=f'<b>{si[i]}</b><br>Pets per 1,000 residents: {str(name[i])}',
            icon=folium.Icon(color='red')
            ).add_to(m)


# Load the boundary file inside a with block so the file handle is closed
with open('data/kyeong.geojson') as f:
    geo_data = json.load(f)

folium.Choropleth(
    geo_data = geo_data,
    data = df,
    columns= ['시군명', '1000명당마리수'],
    key_on = 'feature.properties.sggnm',
    fill_color = 'Reds',
    fill_opacity = 0.7,
    line_opacity = 0.4,
    legend_name = 'Registered pets per 1,000 residents (animals)'
).add_to(m)

folium.LayerControl().add_to(m)

m

Adding average age by si/gun

Code
age = pd.read_csv('data/평균연령집계현황.csv')
age.head()
   연도  월  행정구역구분명  행정구역명  남자 평균연령  여자 평균연령  평균연령
0  2022  9  도  경기도  41.4  43.2  42.3
1  2022  9  시군  경기도 가평군  49.1  51.6  50.3
2  2022  9  읍면동  경기도 가평군 가평읍  46.9  49.7  48.3
3  2022  9  읍면동  경기도 가평군 북면  55.0  57.9  56.4
4  2022  9  읍면동  경기도 가평군 상면  52.8  56.9  54.7
  • Check missing values
Code
age.isna().sum()
연도         0
월          0
행정구역구분명    0
행정구역명      0
남자 평균연령    0
여자 평균연령    0
평균연령       0
dtype: int64
  • Select a subset of columns for September 2022
Code
age = age[(age['연도'] == 2022) & (age['월'] == 9)][['행정구역명', '평균연령', '남자 평균연령', '여자 평균연령']]
age.head(5)
   행정구역명  평균연령  남자 평균연령  여자 평균연령
0  경기도  42.3  41.4  43.2
1  경기도 가평군  50.3  49.1  51.6
2  경기도 가평군 가평읍  48.3  46.9  49.7
3  경기도 가평군 북면  56.4  55.0  57.9
4  경기도 가평군 상면  54.7  52.8  56.9
  • Keep only the rows whose 행정구역명 stops at the si/gun level and drop the rest
Code
age = age[age['행정구역명'].apply(lambda x: len(x.split())) == 2]
age.head()
    행정구역명  평균연령  남자 평균연령  여자 평균연령
1   경기도 가평군  50.3  49.1  51.6
8   경기도 고양시  42.9  42.0  43.9
56  경기도 과천시  41.5  40.4  42.5
63  경기도 광명시  42.7  41.6  43.8
83  경기도 광주시  42.6  42.0  43.3
  • Remove 경기도 from 행정구역명
Code
age['행정구역명'] = age['행정구역명'].apply(lambda x: x.split()[1])
age.tail()
     행정구역명  평균연령  남자 평균연령  여자 평균연령
509  파주시  41.7  40.8  42.7
527  평택시  40.9  40.0  41.9
553  포천시  47.5  46.4  48.8
568  하남시  40.7  40.2  41.2
583  화성시  38.1  37.7  38.5

Inner join with the existing DataFrame

Code
pet_man = df.merge(age, how='inner', left_on='시군명', right_on='행정구역명')
pet_man.head()
   시군명  등록동물수  등록소유자수  관공서  위도  경도  총 인구수  1000명당마리수  1소유자당마리수  행정구역명  평균연령  남자 평균연령  여자 평균연령
0  가평군  4017  2006.0  가평군청  37.831318  127.509706  62168  64.62  2.00  가평군  50.3  49.1  51.6
1  고양시  73477  54580.0  고양시청  37.658422  126.831964  1079277  68.08  1.35  고양시  42.9  42.0  43.9
2  과천시  2974  2325.0  과천시청  37.429812  126.986963  78301  37.98  1.28  과천시  41.5  40.4  42.5
3  광명시  20161  15698.0  광명시청  37.479097  126.864846  290756  69.34  1.28  광명시  42.7  41.6  43.8
4  광주시  29042  19368.0  광주시청  37.429433  127.255084  388893  74.68  1.50  광주시  42.6  42.0  43.3
  • Drop the 행정구역명 column, then set '시군명' as the index
Code
pet_man = pet_man.drop('행정구역명', axis=1)
pet_man = pet_man.set_index('시군명')
pet_man.tail()
       등록동물수  등록소유자수  관공서  위도  경도  총 인구수  1000명당마리수  1소유자당마리수  평균연령  남자 평균연령  여자 평균연령
시군명
파주시  24057  16584.0  파주시청  37.760041  126.779877  486515  49.45  1.45  41.7  40.8  42.7
평택시  60468  43718.0  평택시청  36.992300  127.112527  569369  106.20  1.38  40.9  40.0  41.9
포천시  11168  5719.0  포천시청  37.894701  127.200341  148469  75.22  1.95  47.5  46.4  48.8
하남시  17634  13085.0  하남시청  37.539057  127.215530  323558  54.50  1.35  40.7  40.2  41.2
화성시  48264  35496.0  화성시청  37.199415  126.831523  892038  54.11  1.36  38.1  37.7  38.5

Correlations

  • The correlations show a few interesting points.
  • The number of registered pets has a negative correlation of about -0.40 with longitude. Does that mean more pets are kept in the west than in the east? The population is concentrated in the west, so that may be the reason.
  • It has a negative correlation of about -0.34 with average age. One could read this as younger areas being more inclined to register and keep pets, although the population-adjusted figure (1000명당마리수) correlates weakly positively with average age (about 0.18), so the effect may mostly reflect population size.
  • There is no meaningful difference between men and women (-0.335 vs. -0.337).
Code
# Restrict to numeric columns: 관공서 holds text and cannot be correlated
pet_man.select_dtypes(include='number').corr()
           등록동물수  등록소유자수  위도  경도  총 인구수  1000명당마리수  1소유자당마리수  평균연령  남자 평균연령  여자 평균연령
등록동물수  1.000000  0.929165  -0.258325  -0.400649  0.801469  0.603257  -0.320633  -0.337281  -0.335284  -0.336903
등록소유자수  0.929165  1.000000  -0.326905  -0.404202  0.802641  0.444666  -0.352341  -0.355903  -0.351853  -0.354996
위도  -0.258325  -0.326905  1.000000  -0.054979  -0.287853  -0.132644  0.376955  0.531682  0.521886  0.535785
경도  -0.400649  -0.404202  -0.054979  1.000000  -0.406309  0.010427  0.624281  0.573513  0.594505  0.554759
총 인구수  0.801469  0.802641  -0.287853  -0.406309  1.000000  0.084719  -0.428738  -0.529098  -0.522676  -0.530333
1000명당마리수  0.603257  0.444666  -0.132644  0.010427  0.084719  1.000000  0.197501  0.184070  0.188544  0.180885
1소유자당마리수  -0.320633  -0.352341  0.376955  0.624281  -0.428738  0.197501  1.000000  0.756414  0.785313  0.735683
평균연령  -0.337281  -0.355903  0.531682  0.573513  -0.529098  0.184070  0.756414  1.000000  0.996922  0.997771
남자 평균연령  -0.335284  -0.351853  0.521886  0.594505  -0.522676  0.188544  0.785313  0.996922  1.000000  0.989792
여자 평균연령  -0.336903  -0.354996  0.535785  0.554759  -0.530333  0.180885  0.735683  0.997771  0.989792  1.000000

Saving the final DataFrame as a csv file

Code
pet_man.to_csv('pet_man.csv', encoding='utf-8-sig')

Future work

  • Working with nationwide data would probably give more interesting results.
  • Showing pet cafes or pet-friendly tourist spots on the map could help both people who keep pets and people planning a business.
  • Why do 부천, 안산, and 의정부 have more registered pets than other si/gun? Do they have shelters for abandoned animals?